Abstract

The probabilistic performance of a number of algorithms for the Satisfiability
Problem (SAT) has been investigated analytically and experimentally using a
constant-clause-size model generating n clauses of k literals taken from r variables
as well as a constant-density model generating n clauses containing each of r
variables independently with probability p. In the case of the constant-density model
one algorithm has been shown to solve SAT in polynomial time with probability
approaching 1 as n and r get large when p > ln(n)/r and another has been shown
to solve SAT in polynomial time with probability approaching 1 as n and r get
large when p < ln(n)/(2r). In the case of the constant-clause-size model the unit
clause heuristic has been shown to be effective, in probability, when
$\lim_{n,r\to\infty} n/r < \frac{2^{k-1}}{k}\left(\frac{k-1}{k-2}\right)^{k-2}$
and another heuristic has been shown to be effective, in probability, when
$\lim_{n,r\to\infty} n/r < \frac{2^{k-2}}{k}\,(1+\ln(k-1))$ for k > 3.
When k = 3 the unit clause heuristic with the next variable given an assignment
which satisfies the maximum number of clauses has been shown effective, in
probability, when $\lim_{n,r\to\infty} n/r < 3$. Experiments have shown the existence
of other algorithms which perform better, in the probabilistic sense, than the
algorithms analyzed.

1. Research Objective

The goal of this research is to develop and analyze algorithms which can, in some
practical sense, solve certain NP-complete problems efficiently. By solve we mean
determine whether a solution to a given instance of an NP-complete problem exists
where, for the problems we have considered, a solution is an assignment of values to
a list of variables which causes some predicate to be true. We do not consider actually
finding solutions when they exist since doing so adds unnecessary complexity to the
statement of the algorithms: the algorithms we consider can all be modified to find
solutions without significantly altering performance. NP-complete problems are
found in Cryptology, Operations Research, Artificial Intelligence, Computer System
Design and many other areas. There is no known algorithm for any NP-complete
problem which runs in time bounded by a polynomial in the length of the input
(polynomial time) in the worst case, nor is one likely to be found. We seek algorithms
which solve nearly every instance of specific NP-complete problems in polynomial
time.

To prove an algorithm A solves nearly every instance of a specific problem X in
polynomial time we establish a probability distribution D(n) on instances of X of
"size" n (referred to as a model) and then show that A solves a random instance of
X generated according to D(n) in polynomial time with probability approaching 1
as n approaches infinity; then A is said to solve X efficiently in probability. Usually
the proof holds only under certain conditions. Sometimes, when D(n) is such as to
heavily favor the generation of instances with solutions, the weaker result that A
"proves" the existence of a solution to a random instance of X in polynomial time
with probability bounded from below by a constant greater than zero is obtained
instead; then A is said to solve X efficiently with bounded probability. Again, the
result holds only under certain conditions (one condition that must be satisfied is
that nearly all random instances generated according to D(n) have a solution). The
algorithms that we consider here "prove" the existence of a solution by repeatedly
choosing a variable and an assignment to that variable until the predicate is true: at
each step the possible choices are ranked based on some heuristic and the top ranked
possibility is chosen. For the kinds of algorithms and distributions we consider, if
A solves X efficiently with bounded probability under some set of conditions then
we may regard this as strong evidence that the Backtrack algorithm, using the
heuristics of A to decide the order in which to consider variables and assign values,
solves X efficiently in probability under the same set of conditions.

The NP-complete problem we are primarily interested in is the Satisfiability
problem (SAT). An instance I of SAT is a boolean expression in conjunctive normal
form and a solution to that instance, if one exists, is a truth assignment to the
variables in I which causes I to have value true; such a truth assignment is said
to satisfy I. SAT remains NP-complete even if all disjunctions contain as few as
three literals. SAT is closely related to problems in Artificial Intelligence as well
as other NP-complete problems. Algorithms which solve SAT efficiently in some
probabilistic sense will, with slight modification, probably solve other NP-complete
problems efficiently in the same probabilistic sense.

2. Status of the Research

Although there has been a significant level of research activity in this area, no one
else has obtained the results we have obtained for algorithms designed to
solve instances of SAT efficiently in some probabilistic sense.

Two models have been used for analysis: one is a constant-clause-size model and
the other is a constant-density model. According to the constant-clause-size model
a random instance of SAT contains n clauses (disjunctions) selected independently
and uniformly from the set of all possible disjunctions containing exactly k literals
which can be composed from r boolean variables under the restriction that no two
literals in the same disjunction are associated with the same variable. We are
interested in the case k ≥ 3 since SAT is NP-complete if clauses are allowed to have
three or more literals. For the special case k = 3 SAT is called 3-SAT. According
to the constant-density model a random instance of SAT contains n clauses each
generated independently as follows: for each of r variables (a) place into the clause,
with probability p/2, the uncomplemented literal associated with the variable, (b)
place into the clause, with probability p/2, the complemented literal associated with
the variable and (c) place neither complemented nor uncomplemented literal into
the clause with probability 1 - p.
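
The two models can be illustrated with a short Python sketch. The encoding is an assumption made only for this illustration: variable number i is the integer i, its complemented literal is -i, and a clause is a frozenset of such integers. The function names are hypothetical.

```python
import random

def constant_clause_size_instance(n, r, k, rng=random):
    """n clauses, each with exactly k literals over k distinct variables out of r."""
    clauses = []
    for _ in range(n):
        variables = rng.sample(range(1, r + 1), k)  # no variable repeats in a clause
        # Each chosen variable appears complemented or not with equal probability.
        clauses.append(frozenset(v if rng.random() < 0.5 else -v for v in variables))
    return clauses

def constant_density_instance(n, r, p, rng=random):
    """n clauses; each variable enters a clause as v w.p. p/2, as -v w.p. p/2."""
    clauses = []
    for _ in range(n):
        clause = set()
        for v in range(1, r + 1):
            x = rng.random()
            if x < p / 2:
                clause.add(v)       # (a) uncomplemented literal
            elif x < p:
                clause.add(-v)      # (b) complemented literal
            # (c) otherwise the variable is left out, with probability 1 - p
        clauses.append(frozenset(clause))
    return clauses
```

Note that the constant-density generator can produce clauses of any size from 0 to r, including the null clause, whereas every clause from the constant-clause-size generator has exactly k literals.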

The following two algorithms solve SAT efficiently in probability under the
constant-density model when n and r are polynomially related. Let I be a random
instance of SAT.

A1: Repeat
        Randomly choose a truth assignment t for the variables in I
    Until t satisfies I
    Output("a solution exists")

A2: Search I for a null clause
    If a null clause was found Then Output("no solution exists")
    Else Output("cannot determine whether a solution exists")
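
A minimal Python sketch of these two procedures is given below, assuming clauses are frozensets of nonzero integers (i for a variable, -i for its complement). The `max_tries` cap and the "gave up" message are additions for the sketch only; the procedure as stated repeats without bound.

```python
import random

def a1_random_assignment(clauses, r, max_tries=1000, rng=random):
    """A1: guess uniformly random truth assignments until one satisfies I."""
    for _ in range(max_tries):  # cap added for the sketch; not in the source
        t = {v: rng.random() < 0.5 for v in range(1, r + 1)}
        # A clause is satisfied if some literal agrees with the assignment.
        if all(any(t[abs(l)] == (l > 0) for l in c) for c in clauses):
            return "a solution exists"
    return "gave up after max_tries guesses"

def a2_null_clause(clauses):
    """A2: an empty (null) clause can never be satisfied."""
    if any(len(c) == 0 for c in clauses):
        return "no solution exists"
    return "cannot determine whether a solution exists"
```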


In [3] it is shown that A1 solves SAT efficiently in probability when p > ln(n)/r
and A2 correctly determines that no solution exists for a random instance of SAT
with probability approaching 1 as n and r approach infinity when p < ln(n)/(2r).
Thus, under the constant-density model, SAT is solved efficiently in probability by
algorithms A1 and A2 for all but a vanishingly small range of values of p if n and r are
polynomially related (it is easy to see why this is a reasonable restriction). Although
these results are theoretically interesting they have little practical meaning since it
is unlikely that a random truth assignment satisfies a random instance of SAT or
that a random instance of SAT contains a null clause. The results obtained for the
constant-clause-size model are probably more meaningful.

Assuming the constant-clause-size model, the algorithms below solve SAT
efficiently with bounded probability under various conditions. The algorithms are
defined using the following symbols, terms and functions. Let
$V = \{v_1, v_2, \ldots, v_r\}$ be the set of r variables from which clauses are
composed and let $L = \{v_1, \bar{v}_1, \ldots, v_r, \bar{v}_r\}$ be the set of 2r
literals associated with V (the set of literals contained in a clause c is a subset
of L such that for all $1 \le i \le r$, $v_i$ and $\bar{v}_i$ do not both appear
in c). For every $u \in L$, var(u) is the variable associated with u (for example,
$var(v_1) = var(\bar{v}_1) = v_1$). For all $1 \le i \le r$, $v_i \in L$ and
$\bar{v}_i \in L$ are said to be complementary literals. For every $u \in L$,
comp(u) is the literal in L which is complementary to u (for example,
$comp(\bar{v}_1) = v_1$). If a clause contains only one literal
it is called a unit clause and a unit clause may be regarded as being a literal. Let I
be a random instance of SAT generated according to the constant-clause-size model.
Let |u| denote the number of occurrences in I of literal u.

A3: Repeat
        If there is a unit clause l in I Then u <- l
        Else choose u randomly from L
        Remove from I all clauses containing u
        Remove from I all occurrences of comp(u)
        L <- L - {u, comp(u)}
    Until I is empty or there exist two complementary unit clauses in I
    If I is empty Then Output("a solution exists")
    Else Output("cannot find a solution")
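
The unit clause procedure above can be sketched in Python as follows. The encoding (integer i for a variable, -i for its complement, clauses as sets) is an assumption of the sketch. The test for an empty clause is an added safeguard for inputs that already contain a null clause, and `min(units)` is an arbitrary tie-break chosen only to keep runs reproducible.

```python
import random

def reduce_instance(clauses, u):
    """Set literal u true: drop clauses containing u, delete comp(u) = -u."""
    return [c - {-u} for c in clauses if u not in c]

def a3_unit_clause(clauses, r, rng=random):
    clauses = [frozenset(c) for c in clauses]
    literals = {l for v in range(1, r + 1) for l in (v, -v)}  # the set L
    while clauses:
        units = {next(iter(c)) for c in clauses if len(c) == 1}
        # Stop on two complementary unit clauses (or a null clause: added check).
        if any(not c for c in clauses) or any(-l in units for l in units):
            return "cannot find a solution"
        if units:
            u = min(units)                      # any unit clause may be taken
        else:
            u = rng.choice(sorted(literals))    # otherwise a random literal of L
        clauses = reduce_instance(clauses, u)
        literals -= {u, -u}
    return "a solution exists"
```

The sketch tests the stopping condition at the top of the loop rather than the bottom; for instances with no unit clauses at the start, such as those of the constant-clause-size model with k ≥ 3, the outcome is the same.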


A4: Repeat
        If there is a unit clause l in I Then u <- l
        Else begin
            Choose v randomly from V
            If |v| > |comp(v)| Then u <- v Else u <- comp(v)
        End
        Remove from I all clauses containing u
        Remove from I all occurrences of comp(u)
        V <- V - {var(u)}
        L <- L - {u, comp(u)}
    Until I is empty or there exist two complementary unit clauses in I
    If I is empty Then Output("a solution exists")
    Else Output("cannot find a solution")

A5: Repeat
        Let c be a smallest clause in I
        Choose u randomly from c
        Remove from I all clauses containing u
        Remove from I all occurrences of comp(u)
    Until I is empty or there exist two complementary unit clauses in I
    If I is empty Then Output("a solution exists")
    Else Output("cannot find a solution")
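
A4 and A5 share the same outer loop as A3 and differ only in how the next literal is chosen. The Python sketch below makes that explicit by passing the selection rule as a function. The integer encoding of literals, the names `choose_a4`, `choose_a5` and `run`, and the null-clause check are assumptions of the sketch.

```python
import random

def reduce_instance(clauses, u):
    """Set literal u true: drop clauses containing u, delete comp(u) = -u."""
    return [c - {-u} for c in clauses if u not in c]

def occurrences(clauses, lit):
    """|lit|: number of occurrences of the literal in the instance."""
    return sum(1 for c in clauses if lit in c)

def choose_a4(clauses, variables, rng):
    """A4: unit clause first, else a random variable set to its majority sign."""
    units = sorted(next(iter(c)) for c in clauses if len(c) == 1)
    if units:
        return units[0]
    v = rng.choice(sorted(variables))
    return v if occurrences(clauses, v) > occurrences(clauses, -v) else -v

def choose_a5(clauses, variables, rng):
    """A5: a random literal from a smallest clause."""
    smallest = min(clauses, key=len)
    return rng.choice(sorted(smallest))

def run(clauses, r, choose, rng=random):
    clauses = [frozenset(c) for c in clauses]
    variables = set(range(1, r + 1))            # the set V
    while clauses:
        units = {next(iter(c)) for c in clauses if len(c) == 1}
        # Complementary unit clauses (or a null clause: added check) end the run.
        if any(not c for c in clauses) or any(-l in units for l in units):
            return "cannot find a solution"
        u = choose(clauses, variables, rng)
        clauses = reduce_instance(clauses, u)
        variables.discard(abs(u))
    return "a solution exists"
```

For example, `run(instance, r, choose_a5)` applies the smallest-clause rule. Because a unit clause is always a smallest clause, A5 subsumes the unit clause rule without stating it separately.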

In [4] the results below are obtained based on the constant-clause-size model.
In this analysis the parameter k is assumed fixed and n and r are allowed to grow
toward infinity.

1. There exists a constant d and a constant e > d such that a random instance
of SAT has a solution with probability approaching 1 as $n,r \to \infty$ if
$\lim_{n,r\to\infty} n/r < -d/\ln(1-2^{-k})$ and a random instance of SAT has no solution
with probability approaching 1 as $n,r \to \infty$ if $\lim_{n,r\to\infty} n/r > -e/\ln(1-2^{-k})$.

2. Neither A1 nor A2 solves SAT efficiently in probability or with bounded
probability for any fixed (function of k) limiting ratio n/r.


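3. Algorithm A3 solves SAT efficiently with bounded probability if

$$\lim_{n,r\to\infty} n/r < \frac{2^{k-1}}{k}\left(\frac{k-1}{k-2}\right)^{k-2}$$

and does not solve SAT with probability approaching 1 when

$$\lim_{n,r\to\infty} n/r > \frac{2^{k-1}}{k}\left(\frac{k-1}{k-2}\right)^{k-2}$$

for k [range illegible in the source].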

4. Algorithm A4 solves SAT efficiently with bounded probability if

$$\lim_{n,r\to\infty} n/r < 2.8 \quad \text{when } k = 3$$

and does not solve SAT with probability approaching 1 when

$$\lim_{n,r\to\infty} n/r > 2.8 \quad \text{and } k = 3$$

5. Algorithm A5 solves SAT efficiently with bounded probability if

$$\lim_{n,r\to\infty} n/r < \frac{2^{k-2}}{k}\,(1 + \ln(k-1))$$

for the case k > 3 and if

$$\lim_{n,r\to\infty} n/r < 3 \quad \text{for the case } k = 3$$
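
As a numeric illustration, the closed-form ratio bounds can be evaluated for small k. The sketch below assumes the two expressions as they appear in the abstract: $\frac{2^{k-1}}{k}\left(\frac{k-1}{k-2}\right)^{k-2}$ for the unit clause heuristic and $\frac{2^{k-2}}{k}(1+\ln(k-1))$ for the other heuristic (stated for k > 3). The function names are hypothetical.

```python
import math

def unit_clause_bound(k):
    """(2^(k-1) / k) * ((k-1)/(k-2))^(k-2); defined for k >= 3."""
    return (2 ** (k - 1) / k) * ((k - 1) / (k - 2)) ** (k - 2)

def smallest_clause_bound(k):
    """(2^(k-2) / k) * (1 + ln(k-1)); the source states this for k > 3."""
    return (2 ** (k - 2) / k) * (1 + math.log(k - 1))

# Tabulate both bounds on the limiting ratio n/r for a few clause sizes.
for k in range(4, 11):
    print(k, round(unit_clause_bound(k), 3), round(smallest_clause_bound(k), 3))
```

Both expressions grow exponentially in k. Their ratio, smallest-clause over unit-clause, is $(1+\ln(k-1)) / \left(2\left(\frac{k-1}{k-2}\right)^{k-2}\right)$, which behaves like $(1+\ln(k-1))/(2e)$ for large k.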

The two algorithms below have been studied experimentally using the
constant-clause-size model with k = 3. These algorithms dynamically assign weights to each
literal in L and these weights are used to select the next literal. The weighting
functions are defined as follows:

$$w(l) = \sum_{c \in S(l)} 2^{-\mathrm{size}(c)}$$

$$w^*(l) = \prod_{c \in P(l)} \left(1 - 2^{-\mathrm{size}(c)}\right)$$

where S(l) is the collection of clauses in I containing literal l, P(l) is the collection of
clauses resulting from removing clauses containing l and all occurrences of comp(l)
from I and size(c) is the number of literals contained in clause c if c has not been
removed from I and size(c) = ∞ otherwise. It should be noted that size(c) = k
for all c in a random instance of SAT generated according to the
constant-clause-size model but size(c) changes as literals are removed or when c contains the next
chosen literal in the algorithms below. It should also be noted that w*(l) is a
measure of the expected number of solutions that exist for an instance of SAT with
var(l) set so that literal l has value true; choosing a truth assignment to maximize
the expected number of solutions to the remainder of the instance seems to be a
reasonable heuristic and is the basis for algorithm A7.
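
A small worked example may help fix the two weights. The Python sketch below assumes that w(l) sums $2^{-\mathrm{size}(c)}$ over the clauses containing l (the summand is not legible in the source; this form is inferred from the convention that removed clauses have size ∞ and so contribute nothing) and that w*(l) is the product of $1 - 2^{-\mathrm{size}(c)}$ over P(l). Literals are encoded as signed integers, which is also an assumption.

```python
from math import prod

def size_weight(clauses, l):
    """w(l): sum of 2^(-size(c)) over clauses c containing literal l (assumed form)."""
    return sum(2.0 ** -len(c) for c in clauses if l in c)

def solution_weight(clauses, l):
    """w*(l): product of (1 - 2^(-size(c))) over P(l)."""
    # P(l): drop clauses containing l, then delete comp(l) = -l from the rest.
    reduced = [c - {-l} for c in clauses if l not in c]
    return prod(1.0 - 2.0 ** -len(c) for c in reduced)

# Instance (v1 or v2 or v3) and (not v1 or v2) and (not v2 or v3).
instance = [frozenset({1, 2, 3}), frozenset({-1, 2}), frozenset({-2, 3})]
print(size_weight(instance, 2), solution_weight(instance, 2))
```

Each factor $1 - 2^{-\mathrm{size}(c)}$ is the probability that a uniformly random assignment satisfies clause c, so the product estimates the fraction of assignments that satisfy what remains once l is set true.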


A6: Repeat
        u <- l such that l ∈ L and w(l) ≥ w(l') for all l' ∈ L
        Remove from I all clauses containing u
        Remove from I all occurrences of comp(u)
        L <- L - {u, comp(u)}
    Until I is empty or there exist two complementary unit clauses in I
    If I is empty Then Output("a solution exists")
    Else Output("cannot find a solution")

A7: Repeat
        u <- l such that l ∈ L and w*(l) ≥ w*(l') for all l' ∈ L
        Remove from I all clauses containing u
        Remove from I all occurrences of comp(u)
        L <- L - {u, comp(u)}
    Until I is empty or there exist two complementary unit clauses in I
    If I is empty Then Output("a solution exists")
    Else Output("cannot find a solution")
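
Both weighted procedures fit one greedy loop that takes the weighting function as a parameter. The Python sketch below is an illustration under stated assumptions: literals are signed integers, w(l) is taken to be the sum of $2^{-\mathrm{size}(c)}$ over clauses containing l (an inferred form), ties go to the smallest integer literal, and a null-clause check is added so the loop always terminates.

```python
from math import prod

def reduce_instance(clauses, u):
    """Set literal u true: drop clauses containing u, delete comp(u) = -u."""
    return [c - {-u} for c in clauses if u not in c]

def w(clauses, l):
    # Assumed form of w(l): sum of 2^(-size(c)) over clauses containing l.
    return sum(2.0 ** -len(c) for c in clauses if l in c)

def w_star(clauses, l):
    # w*(l): product of (1 - 2^(-size(c))) over the reduced collection P(l).
    return prod(1.0 - 2.0 ** -len(c) for c in reduce_instance(clauses, l))

def greedy(clauses, r, weight):
    """Weighted loop: weight=w sketches A6, weight=w_star sketches A7."""
    clauses = [frozenset(c) for c in clauses]
    literals = {l for v in range(1, r + 1) for l in (v, -v)}  # the set L
    while clauses:
        units = {next(iter(c)) for c in clauses if len(c) == 1}
        # Complementary unit clauses (or a null clause: added check) end the run.
        if any(not c for c in clauses) or any(-l in units for l in units):
            return "cannot find a solution"
        # Pick a literal of maximum weight; max() keeps the first maximum found.
        u = max(sorted(literals), key=lambda l: weight(clauses, l))
        clauses = reduce_instance(clauses, u)
        literals -= {u, -u}
    return "a solution exists"
```

Recomputing every weight at every step costs time proportional to the instance size per literal; an implementation meant for experiments would more likely update the weights incrementally.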

In [5] the following results of experiments on A6 and A7 using the
constant-clause-size model to generate random instances are reported:

6. Algorithm A6 solves SAT efficiently with bounded probability if

$$\lim_{n,r\to\infty} n/r < 3.7 \quad \text{when } k = 3$$

and does not solve SAT with probability approaching 1 if

$$\lim_{n,r\to\infty} n/r > 3.8 \quad \text{when } k = 3$$


7. Algorithm A7 solves SAT efficiently with bounded probability if

$$\lim_{n,r\to\infty} n/r < 3.6 \quad \text{when } k = 3$$

and does not solve SAT with probability approaching 1 if

$$\lim_{n,r\to\infty} n/r > 3.7 \quad \text{when } k = 3$$


8. A random instance of SAT generated according to the constant-clause-size
model has no solution with probability approaching 1 when $n,r \to \infty$ if

$$\lim_{n,r\to\infty} n/r > 4 \quad \text{when } k = 3$$

9. A random instance of SAT generated according to the constant-clause-size
model has a solution with probability approaching 1 when $n,r \to \infty$ if

$$\lim_{n,r\to\infty} n/r < 4 \quad \text{when } k = 3$$


3. Interpretation of Results

The constant-clause-size model seems to generate non-trivial instances of SAT since
the simple-minded algorithms A1 and A2 which work so well on instances
generated by the constant-density model do not work at all well on random instances
generated according to the constant-clause-size model when the limiting ratio of
n/r is fixed (i.e. a function of k). The case of the limiting ratio of n/r being fixed is
important since random instances are "hardest" when the probability that a
solution exists is about 1/2 and this occurs when the limiting ratio is fixed. Despite the
relatively "hard" instances generated by the constant-clause-size model a number of
algorithms have been shown to "prove" that a solution to a given random instance
I of SAT exists for nearly every I that has a solution when k = 3; these algorithms
are not quite as effective for arbitrary k.

Perhaps surprising is the difference in the range of n/r over which algorithms
perform well probabilistically. In particular, A3 and A5 are not much different in
structure but the bound on the limiting ratios of n/r for which good probabilistic
performance is achieved is larger for A5 by a factor of about ln(k). Furthermore,
from a previous result [2], the bound on ratios n/r for which good probabilistic
performance of the pure literal heuristic is achieved does not even increase with k
while the bounds for A3, A4 and A5 are all exponential in k.

We have been able to rank a number of algorithms for solving SAT by their
probabilistic performance. One of these algorithms has been shown experimentally
to be extremely effective on instances of 3-SAT when those instances have solutions.
We have not yet succeeded in producing an algorithm for SAT which, under the
constant-clause-size model, is effective in determining that no solution exists when
its input is an instance with no solution. This is the next step in this research.
After  this  we  intend  to  apply  the  algorithms  and  analysis  mentioned  here  to  other 
NP-complete  problems. 